Between Algorithms: A "Short Cut" Restricted Maximum Likelihood Procedure to Estimate Variance Components
Abstract
A restricted maximum likelihood procedure is described to estimate variance and covariance components in a multivariate mixed model when records are missing for some traits. The algorithm combines features of an expectation-maximization algorithm to estimate the within random effects components with the method of scoring to estimate the between random effects components. The procedure is computationally less demanding per round of iteration than the method of scoring, although the number of iterates required to reach convergence is increased. A computing strategy is described for the example of estimating genetic parameters for first and later lactations of dairy cows.

INTRODUCTION

Restricted maximum likelihood (REML), developed by Patterson and Thompson (10), has become accepted as the preferred method to estimate genetic parameters, i.e., variance and covariance components, from animal breeding data. In spite of highly desirable properties of such estimates (2), REML has been put to little use in practice. Computational requirements are extensive, in particular for multivariate analyses, and practical analyses are often feasible only if some simplifications for specific models can be made (5, 14, 15).

Under certain conditions, including that all information determining selection decisions is included in the model of analysis, maximum likelihood procedures, including REML, account for selection (1, 4, 9, 11). This is particularly relevant, for example, in estimating (co)variance components for first and later lactations of dairy cows. Meyer (5, 6) used a multivariate REML algorithm based on Fisher's method of scoring (MSC) to estimate genetic parameters for the first three lactations. Henderson (4) advocated an expectation-maximization (EM) algorithm for its comparative computational simplicity (per round of iteration) and for its property of forcing estimates to be within the permissible parameter space.
The objectives of this paper are: 1) to present a "short cut" (SHC) of the REML algorithm described by Meyer (5), 2) to outline a computing strategy for the SHC suitable for large data sets, and 3) to investigate the convergence behavior of the new procedure using simulated data.

Received September 25, 1985. 1986 J Dairy Sci 69:1904-1916

GENERAL MODEL

Consider a multivariate mixed model for $q$ traits with one random factor. Let $y$, $b$, $u$, and $e$ denote the vectors of observations, fixed effects, random effects, and residual errors (random), respectively. $X$ and $Z$ are the design matrices for fixed and random effects. The model of analysis can then be written as:

$$y = Xb + Zu + e \qquad [1]$$

with $E(y) = Xb$, $E(u) = 0$, $E(e) = 0$, $V(u) = G$, $V(e) = R$, and $\mathrm{Cov}(u, e') = 0$. The mixed model equations (MME) pertaining to [1] are from Henderson (3):

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix} \qquad [2]$$

and the covariance matrix of the vector of observations is:

$$V(y) = V = ZGZ' + R \qquad [3]$$

Let $T = \{t_{ij}\}$ and $E = \{e_{ij}\}$ denote the $q \times q$ matrices of variance and covariance components between and within the random effects, respectively. Define $\theta$ as the vector of parameters to be estimated, with elements $\theta_m$ ($m = 1, \ldots, q[q+1]$) standing in turn for $t_{ij}$ and $e_{ij}$ ($i \le j = 1, \ldots, q$). With $V_m = \partial V/\partial \theta_m$, $D_{sij} = \partial G/\partial t_{ij}$, and $D_{wij} = \partial R/\partial e_{ij}$, [3] can be rewritten as:

$$V = \sum_{m=1}^{q(q+1)} V_m \theta_m = \sum_{i \le j}^{q} \left( Z D_{sij} Z' \, t_{ij} + D_{wij} \, e_{ij} \right) \qquad [4]$$

RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION

Restricted maximum likelihood maximizes the likelihood of a vector of "error contrasts" independent of the fixed effects, $Sy$ with $SX = 0$, and hence, $E(Sy) = 0$ (10).
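To make the linear decomposition of $V$ in [4] concrete, the simplest case $q = 1$ can be written out. This is only an illustrative sketch: it assumes uncorrelated levels of the random factor and uncorrelated residuals, so that $G = I_s t_{11}$ and $R = I_n e_{11}$, with $s$ levels and $n$ records. Neither assumption is stated in the text above.

```latex
% q = 1: two parameters, \theta = (t_{11}, e_{11})', since q(q+1) = 2.
% Assumed for illustration: G = I_s t_{11}, R = I_n e_{11}.
\begin{align*}
D_{s11} &= \partial G / \partial t_{11} = I_s, &
D_{w11} &= \partial R / \partial e_{11} = I_n, \\
V_1 &= Z D_{s11} Z' = ZZ', &
V_2 &= D_{w11} = I_n, \\
V &= V_1 \theta_1 + V_2 \theta_2 = ZZ'\, t_{11} + I_n\, e_{11}.
\end{align*}
```

Under these assumptions $V$ is linear in the two components, which is the property [4] expresses for general $q$.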
A suitable matrix $S$ arises, for instance, in absorbing the fixed effects into the random effects in [2] (15), which gives:

$$(Z'SZ + G^{-1})\hat{u} = Z'Sy \qquad [5]$$

with:

$$S = R^{-1} - R^{-1}X(X'R^{-1}X)^{-}X'R^{-1} \qquad [6]$$

Assuming a multivariate normal distribution, differentiating the log likelihood of $Sy$ with respect to the elements of $\theta$ and equating the derivatives to zero then yields a set of equations:

$$y'PV_mPy = \mathrm{tr}(PV_m) \quad \text{for } m = 1, \ldots, q(q+1) \qquad [7]$$

with:

$$P = S - SZCZ'S \qquad [8]$$

and:

$$C = (Z'SZ + G^{-1})^{-1} \qquad [9]$$

The EM algorithm uses these equations, [7], which involve "first derivatives", $\mathrm{tr}(PV_m)$.

Method of Scoring

$V$ is a generalized inverse of $P$, i.e., $PVP = P$. Hence, [7] are equivalent to equating quadratics in the data vector to their expectations (15):

$$y'PV_mPy = E[y'PV_mPy] = \mathrm{tr}(PVPV_m) = \sum_{n=1}^{q(q+1)} \mathrm{tr}(PV_mPV_n)\,\theta_n \quad \text{for } m = 1, \ldots, q(q+1) \qquad [10]$$

These equations involve "second derivatives", $\mathrm{tr}(PV_mPV_n)$. Collecting the quadratic forms in [10] in a vector $d$ with elements $d_m = y'PV_mPy$ [$m = 1, \ldots, q(q+1)$] and summarizing the second derivatives in a symmetric matrix $B$ with elements $b_{mn} = \mathrm{tr}(PV_mPV_n)$ then gives the REML equations: $B\theta = d$.
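The quantities in [5] to [10] can be checked numerically on a toy example. The sketch below is not the paper's computing strategy. It assumes one trait ($q = 1$), $G = I\,t$ and $R = I\,e$, invented records, and arbitrary trial values for $t$ and $e$. The helper functions are hypothetical and written in plain Python so the block has no dependencies. The last step reads [10] in matrix form as $B\theta = d$, which is what the definitions of $d$ and $B$ imply, and solves it once at the trial values.

```python
# Toy check of [5]-[10] for q = 1. All numbers are invented for illustration.

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(a, b, scale=1.0):
    """Return a + scale * b for two matrices of equal shape."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inv(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    m = [list(row) + ident for row, ident in zip(a, identity(n))]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [row[n:] for row in m]

# Six records, two fixed-effect levels, three levels of the random factor.
X = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
Z = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
y = [[10.0], [12.0], [11.0], [14.0], [13.0], [15.0]]
t, e = 2.0, 8.0                      # trial values of the two components
theta = [t, e]
n = len(y)
Xt, Zt = transpose(X), transpose(Z)

# [6] with R = I*e: S = (I - X (X'X)^{-1} X') / e
proj = matmul(matmul(X, inv(matmul(Xt, X))), Xt)
S = [[v / e for v in row] for row in add(identity(n), proj, -1.0)]
# [9] with G = I*t, then [8]; S is symmetric, so Z'S = (SZ)'
C = inv(add(matmul(matmul(Zt, S), Z), identity(3), 1.0 / t))
SZ = matmul(S, Z)
P = add(S, matmul(matmul(SZ, C), transpose(SZ)), -1.0)

# Derivative matrices from [4] and V from [3]
V1, V2 = matmul(Z, Zt), identity(n)
Vm = [V1, V2]
V = add([[t * v for v in row] for row in V1], V2, e)

# Quadratic forms d_m, first derivatives tr(P V_m), second derivatives b_mn
Py = matmul(P, y)
d = [matmul(matmul(transpose(Py), Vk), Py)[0][0] for Vk in Vm]
PV = [matmul(P, Vk) for Vk in Vm]
first = [trace(pv) for pv in PV]
B = [[trace(matmul(PV[m], PV[k])) for k in range(2)] for m in range(2)]

# Identities used in the text: PX = 0, PVP = P, tr(P V_m) = sum_n b_mn theta_n
px_err = max(abs(v) for row in matmul(P, X) for v in row)
pvp = matmul(matmul(P, V), P)
pvp_err = max(abs(a - b) for ra, rb in zip(pvp, P) for a, b in zip(ra, rb))
implied = [sum(B[m][k] * theta[k] for k in range(2)) for m in range(2)]

# One round of solving B theta = d at the trial values
theta_new = [row[0] for row in matmul(inv(B), [[d[0]], [d[1]]])]
print(px_err, pvp_err, first, implied, theta_new)
```

The dense $n \times n$ matrix $P$ is formed explicitly here only because the example is tiny. For data sets of realistic size this would not be feasible, which is why the computational cost per round of iteration matters.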
Publication date: 1986